Statistical Analysis of CEUA Survey¶
Readme¶
Summary¶
1. Installing external libraries needed¶
1.1 Importing common libraries for analysis¶
2. Loading data from GDrive¶
--- Methodological Summary & Plot Explanation ---
To identify similar rows, this script employs a standard NLP vector space model.
First, each row's text is converted into a numerical vector using a Bag-of-Ngrams approach
(counting both single words and two-word phrases).
The similarity between these vectors is then calculated using Cosine Similarity.
This metric evaluates the cosine of the angle between two vectors, effectively measuring
how similar their content is, irrespective of the total length of the text. A score
of 1.0 means the content is proportionally identical.
The plot shows the number of pairs found at different similarity cutoffs.
The "elbow" on this plot indicates a point of diminishing returns: a cutoff near it
captures the most significant duplicates without including too many dissimilar pairs.
We will proceed with an empirically chosen default threshold of 0.85. This value is
a common baseline as it typically represents a strong textual overlap, allowing for
minor variations (like typos or rephrasing) while still ensuring the core content is
the same.
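The steps described above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's own code: the function names and the three sample rows are hypothetical, and the original script may well rely on a vectorizer from an external library instead.

```python
import math
import re
from collections import Counter
from itertools import combinations


def ngram_counts(text, max_n=2):
    """Bag of n-grams: counts of single words and two-word phrases."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty row is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similar_pairs(rows, threshold=0.85):
    """Return (i, j, score) for every pair scoring at or above the threshold."""
    vectors = [ngram_counts(row) for row in rows]
    pairs = []
    for i, j in combinations(range(len(rows)), 2):
        score = cosine_similarity(vectors[i], vectors[j])
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


# Hypothetical rows, not taken from the survey.
rows = [
    "the committee meets once a month",
    "the committee meets once a month the committee meets once a month",
    "animals are used for teaching",
]

pairs = similar_pairs(rows)
# Number of pairs found at several cutoffs (the quantity shown in the plot).
pairs_per_cutoff = {t: len(similar_pairs(rows, t)) for t in (0.50, 0.85, 0.99)}
print(pairs, pairs_per_cutoff)
```

The second row repeats the first, so the two score close to 1.0 despite their different lengths, while the unrelated third row shares no n-grams with either and scores 0.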
Success! Spreadsheet 'Principal' loaded into a DataFrame.
--- Analyzing Similarity Thresholds ---
Plot generated to help choose a similarity cutoff.
--- Finding pairs with similarity >= 0.85 ---
Found 12 similar pairs.
Suggested rows to remove (duplicates): 115, 118, 159, 17, 302, 343, 354, 365, 80, 85, 87
Full report saved to 'similar_rows_report.csv'
2.1 Loading data from cleaned up spreadsheet after the Qualitative Analysis¶
Open-ended survey responses were systematically analyzed to transform unstructured text into categorical variables for statistical analysis. Through an inductive coding process, categories and themes were derived directly from the response content to build a comprehensive codebook. This involved multi-dimensional coding for complex answers and synthesizing individual codes into broader thematic blocks. The final output was a new set of numerically coded variables, which formed the dataset for the quantitative analysis.
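As a small illustration of how a codebook turns free-text answers into numeric codes, the sketch below uses the five religion/code pairs visible in the dataset preview (column `Religiao_codificado(8)`). The full codebook is not shown in this notebook, so the dictionary is deliberately partial, and the helper name `encode` is hypothetical.

```python
# Pairs observed in the dataset preview; the complete codebook is not shown.
RELIGION_CODEBOOK = {
    "Catolicismo": 2,
    "Evangélica": 3,
    "Cristã protestante": 3,   # two answers can share one thematic code
    "Espiritismo": 4,
    "Não tenho religião": 5,
}


def encode(answer, codebook, missing=None):
    """Look up the numeric code for a free-text answer, ignoring outer spaces."""
    return codebook.get(answer.strip(), missing)


coded = [encode(a, RELIGION_CODEBOOK) for a in ["Catolicismo", " Espiritismo ", "Budismo"]]
print(coded)  # answers absent from the partial codebook come back as None
```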
Success! Your spreadsheet has been loaded into a DataFrame.
2.2 Showing some lines of the dataset¶
| | Cod. | Ativo | 3. Idade | 4. Genero | 5. Regiao | 6. Estado | 7. Cidade | 8. Religiao | Religiao_codificado(8) | 9. Vinculo | ... | 55. Suas_colocacoes_sao_respeitadas | 56. Ha_resistencia_as_suas_propostas | 57. Demais_membros_atribuem_o_mesmo_nivel_de_preocupacao | 58. Sente-se_estressado_nas_reunioes | 59. Receio_de_aceitar_ou_rejeitar_protocolo | 60. Membros_tem_receio_de_rejeitar_protocolos | 61. Ao_avaliar_projeto_de_membro_ele_deve_se_ausentar | 62. Membros_que_pesquisam_com_modelos_animais_minimizam_o_sofrimento_deles | 63. O_quanto_a_CEUA_se_baseia_no_principio_dos_3Rs_escala | 64. Comentario_sobre_CEUAs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | TRUE | Entre 31 e 40 anos | Masculino | Sul | RS | Santa Maria | Catolicismo | 2 | Não | ... | Constantemente | Raramente | Constantemente | Nunca | Nunca | Nunca | Sim | Nunca | 5 | |
| 1 | 2 | TRUE | Entre 31 e 40 anos | Feminino | Sul | SC | Curitibanos | Não tenho religião | 5 | Não | ... | Constantemente | Nunca | Constantemente | Raramente | Nunca | Raramente | Sim | Raramente | 4 | |
| 2 | 3 | TRUE | Entre 51 e 60 anos | Feminino | Sul | RS | Cerro Largo | Evangélica | 3 | Não | ... | Frequentemente | Nunca | Frequentemente | Nunca | Nunca | Raramente | Sim | Raramente | 4 | |
| 3 | 4 | TRUE | Entre 41 e 50 anos | Feminino | Nordeste | SE | Aracaju | Não tenho religião | 5 | Não | ... | Constantemente | Raramente | Frequentemente | Frequentemente | Raramente | Nunca | Sim | Frequentemente | 3 | Percepo que as CEUAs ainda formalizam o uso d... |
| 4 | 5 | TRUE | Entre 61 e 70 anos | Feminino | Centro-oeste | DF | Brasília | Espiritismo | 4 | Não | ... | Constantemente | Raramente | Frequentemente | Nunca | Nunca | Nunca | Sim | Raramente | 4 | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 376 | 377 | TRUE | Entre 31 e 40 anos | Feminino | Sul | PR | Realeza | Catolicismo | 2 | Sim | ... | Constantemente | Nunca | Frequentemente | Raramente | Nunca | Raramente | Sim | Raramente | 4 | |
| 377 | 378 | TRUE | Entre 51 e 60 anos | Feminino | Sudeste | MG | Uberaba | Espiritismo | 4 | Sim | ... | Frequentemente | Raramente | Frequentemente | Raramente | Nunca | Raramente | Sim | Frequentemente | 3 | Sinto-me gratificada por participar de uma CEU... |
| 378 | 379 | TRUE | Entre 41 e 50 anos | Feminino | Norte | PA | Belém | Cristã protestante | 3 | Não | ... | Constantemente | Nunca | Constantemente | Nunca | Nunca | Nunca | Sim | Constantemente | 5 | |
| 379 | 380 | TRUE | Entre 31 e 40 anos | Prefiro não responder | Sudeste | SP | São Paulo | Catolicismo | 2 | Não | ... | Frequentemente | Não sei dizer | Frequentemente | Raramente | Frequentemente | Raramente | Sim | Raramente | 4 | Acho a existência delas algo importante, mas a... |
| 380 | ... |
381 rows × 78 columns
2.3 Converting the set of spreadsheets into a relational database for table creation¶
Success: Google Sheet loaded.
--- Clearing database for a fresh start ---
Cleared 9 old tables.
--- Creating new, complete relational schema ---
Table 'Religions' created.
Table 'JustificativaRelatoriaLookup' created.
Table 'JustificativaFormacaoLookup' created.
Table 'PapelCEUALookup' created.
Table 'FuncaoAdminLookup' created.
Table 'JustificativaMalNecessarioLookup' created.
Table 'AvaliacaoDanosBeneficiosLookup' created.
Table 'Respondents' created.
Table 'SurveyAnswers' created to hold all original columns.
--- Populating Lookup Tables ---
Populated all available lookup tables with provided meanings.
--- Populating main data tables ---
Migration complete! The database 'ceua_analysis_v3.db' is correct and ready for use.
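The lookup-table layout reported in the log above can be sketched with `sqlite3`. Only the table names `Religions` and `Respondents` come from the log; every column name here is an assumption made for illustration, and an in-memory database stands in for 'ceua_analysis_v3.db'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the notebook writes to a file instead
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical columns: the real schema is not shown in the notebook output.
conn.executescript("""
CREATE TABLE Religions (
    religion_code INTEGER PRIMARY KEY,
    meaning       TEXT NOT NULL
);
CREATE TABLE Respondents (
    respondent_id INTEGER PRIMARY KEY,
    is_active     INTEGER NOT NULL,
    religion_code INTEGER REFERENCES Religions(religion_code)
);
""")

# Two code/meaning pairs and two respondents taken from the dataset preview.
conn.executemany("INSERT INTO Religions VALUES (?, ?)",
                 [(2, "Catolicismo"), (4, "Espiritismo")])
conn.executemany("INSERT INTO Respondents VALUES (?, ?, ?)",
                 [(1, 1, 2), (5, 1, 4)])

# Joining the lookup table recovers the readable meaning of each code.
joined = conn.execute("""
    SELECT r.respondent_id, l.meaning
    FROM Respondents AS r
    JOIN Religions AS l ON l.religion_code = r.religion_code
    ORDER BY r.respondent_id
""").fetchall()
print(joined)
```

Keeping the meanings in a lookup table means the coded column stays compact while translations and labels can be changed in one place.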
Success: Loaded data for consistency check.
======================================================================
### DATA INTEGRITY AUDIT: SPA REPRESENTATIVE CONSISTENCY ###
======================================================================
This analysis checks for logical contradictions between Question 15 (Does an SPA representative exist?) and Question 23 (Does the SPA representative perform reporting duties?).
The table below shows the number of respondents for each combination of answers.
- 5 respondent(s) answered 'No, representative does not exist' to Q15, but then answered 'Yes, performs duties' to Q23.
- IDs of these respondents: [2, 56, 229, 334, 371]
- 2 respondent(s) answered 'Don't know' to Q15, but then answered 'Yes, performs duties' to Q23.
- IDs of these respondents: [207, 276]
- 1 respondent(s) answered 'Yes, representative exists' to Q15, but then answered 'No representative exists (per Q23)' to Q23.
- IDs of these respondents: [288]
----------------------------------------------------------------------
SUMMARY: A total of 8 unique respondent(s) provided at least one logically inconsistent answer.
List of all unique inconsistent IDs: [2, 56, 207, 229, 276, 288, 334, 371]
======================================================================
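The audit logic can be outlined as a cross-tabulation followed by a filter on contradictory answer pairs. The records and answer labels below are hypothetical stand-ins, not the survey data; only the three kinds of contradiction mirror the report above.

```python
from collections import Counter

# Hypothetical records: (respondent id, Q15 answer, Q23 answer).
answers = [
    (1, "yes", "yes"),
    (2, "no", "yes"),          # no representative, yet one performs duties
    (3, "dont_know", "yes"),   # unsure a representative exists, yet one performs duties
    (4, "yes", "no_rep"),      # representative exists, yet Q23 says none exists
    (5, "no", "no_rep"),
]

# Answer combinations treated as logically inconsistent.
CONTRADICTIONS = {("no", "yes"), ("dont_know", "yes"), ("yes", "no_rep")}

# Count of respondents for each (Q15, Q23) combination.
crosstab = Counter((q15, q23) for _, q15, q23 in answers)

inconsistent_ids = sorted(
    {rid for rid, q15, q23 in answers if (q15, q23) in CONTRADICTIONS}
)
print(crosstab)
print(inconsistent_ids)
```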
3. Univariate Analysis¶
This section examines one variable at a time to describe and summarize its distribution and core characteristics.
Success: Loaded age data for active respondents.
--- Descriptive Statistics for Age Range (Translated) ---
count     369
unique    5
top       Between 41 and 50 years
freq      134
Name: AgeRange, dtype: object
--- Descriptive Statistics for Age (Estimated Numeric) ---
Mean Age (estimated): 43.52
Median Age (estimated): 45.50
Standard Deviation (estimated): 9.62
--- Generating Age Distribution Plot (Translated) ---
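One plausible way to obtain the "estimated numeric" age is to replace each range with its midpoint; the reported median of 45.50 is consistent with that, but the notebook does not show the mapping, so this is an assumption. The values for the two open-ended ranges are guesses, and the four sample answers are hypothetical.

```python
import statistics

# Assumed midpoints; the open-ended ranges have no true midpoint.
AGE_MIDPOINTS = {
    "Até 30 anos": 25.0,          # assumed value
    "Entre 31 e 40 anos": 35.5,
    "Entre 41 e 50 anos": 45.5,
    "Entre 51 e 60 anos": 55.5,
    "Entre 61 e 70 anos": 65.5,
    "Mais de 70 anos": 75.0,      # assumed value
}

# Hypothetical sample of answers.
sample = [
    "Entre 31 e 40 anos",
    "Entre 41 e 50 anos",
    "Entre 41 e 50 anos",
    "Entre 61 e 70 anos",
]

ages = [AGE_MIDPOINTS[label] for label in sample]
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
std_age = statistics.stdev(ages)  # sample standard deviation
print(mean_age, median_age, round(std_age, 2))
```

Statistics computed this way describe the midpoints, not the true ages, so they are best read as rough estimates.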
Success: Loaded gender data for active respondents.
--- Descriptive Statistics for Gender (Translated) ---
count     369
unique    3
top       Female
freq      233
Name: Gender, dtype: object
Frequency Count (Translated):
Gender
Female                  233
Male                    135
Prefer not to answer    1
Name: count, dtype: int64
--- Generating Gender Distribution Plot (Translated) ---
Success: Loaded region data for active respondents.
--- Descriptive Statistics for Region (Translated) ---
count     369
unique    5
top       Southeast
freq      179
Name: Region, dtype: object
Frequency Count (Translated):
Region
Southeast       179
South           73
Northeast       58
North           30
Central-West    29
Name: count, dtype: int64
--- Generating Region Distribution Plot (Translated) ---
Success: Loaded state data for active respondents.
--- Descriptive Statistics for State (Cleaned) ---
count     369
unique    24
top       SP
freq      99
Name: State, dtype: object
Frequency Count (Cleaned):
State
SP    99
MG    42
RJ    34
RS    28
PR    26
SC    18
DF    14
PA    14
BA    12
PE    9
PB    8
CE    8
SE    7
PI    7
AM    6
ES    6
MS    5
TO    5
GO    5
MA    4
MT    4
RN    3
RO    3
AC    2
Name: count, dtype: int64
--- Generating State Distribution Plot ---
Success: Loaded religion data for active respondents.
--- Descriptive Statistics for Religion (Translated) ---
count     369
unique    7
top       Catholicism
freq      168
Name: Religion, dtype: object
Frequency Count (Translated):
Religion
Catholicism                     168
Spiritism                       61
No specific religion            56
Atheism/Agnosticism             33
Other Christians/Protestants    28
Afro-Brazilian Religions        13
Other/No answer                 10
Name: count, dtype: int64
--- Generating Religion Distribution Plot (Translated) ---
Success: Loaded 'Vinculo' data for active respondents.
--- Descriptive Statistics for Vinculo (Translated) ---
count     369
unique    2
top       No
freq      333
Name: Vinculo, dtype: object
Frequency Count (Translated):
Vinculo
No     333
Yes    36
Name: count, dtype: int64
--- Generating Vinculo Distribution Plot (Translated) ---
Success: Loaded raw NGO name data.
--- Question 11. NGO Names as they appear in the database ---
ONG Sou Amigo - Coordenadora
AMAA - colaboradora
Seres Viventes
SOS Animal - colaboradora
Lar Oasis
Adote um Gatinho ( voluntaria)
ALPA. RT
Samb Sociedade Amor de Bicho 1a tesoureira
Amada
APAT - ASSOCIAÇÃO DE PROTEÃO ANIMAL DE TEFÉ
Associação Melhores Amigos dos Animais - Presid...
Adota Patos
KAOSA
Viva Bichos, Defesa da Vida Animal
Instituto Espaço Silvestre - Presidente
ONG DEIXE VIVER
Refugio dos bichos
ADA - Associação Defensora dos Animais
UPAP - Voluntária
GAAR
Membro da Diretoria
SOS ANIMAIS DE RUA, voluntária
Amacap
Sospet
pastoral de protetores/apoio técnico e atendime...
Ong Animais da Aldeia - Voluntário
Associação Ouropretana de Proteção Animal
Forum Animal, consultora
Associação Defensora dos Animais São Francisco ...
Instituto Flora Vida - Voluntária
Desabandone
AMPARA, auxílio em atividade com animais da ong...
APATA, doadora de ração
SOS Bichinho
Amaa. Presidente do conselho deliberativo e fis...
Colaboradora- Santuário Anjos de Assis
NGO Affiliations of Respondents
A cleaned, alphabetized list of unique NGO names provided in the survey.
| # | NGO Name |
|---|---|
| 1 | Ada - Associação Defensora Dos Animais |
| 2 | Adota Patos |
| 3 | Adote Um Gatinho |
| 4 | Alpa |
| 5 | Amaa |
| 6 | Amacap |
| 7 | Amada |
| 8 | Ampara |
| 9 | Apat - Associação De Proteão Animal De Tefé |
| 10 | Apata |
| 11 | Associação Defensora Dos Animais São Francisco De Assis (Adasfa) |
| 12 | Associação Melhores Amigos Dos Animais |
| 13 | Associação Ouropretana De Proteção Animal |
| 14 | Desabandone |
| 15 | Forum Animal |
| 16 | Gaar |
| 17 | Instituto Espaço Silvestre |
| 18 | Instituto Flora Vida |
| 19 | Kaosa |
| 20 | Lar Oasis |
| 21 | Ong Animais Da Aldeia |
| 22 | Ong Sou Amigo |
| 23 | Pastoral De Protetores |
| 24 | Refugio Dos Bichos |
| 25 | Samb Sociedade Amor De Bicho |
| 26 | Santuário Anjos De Assis |
| 27 | Seres Viventes |
| 28 | Sos Animais De Rua |
| 29 | Sos Animal |
| 30 | Sos Bichinho |
| 31 | Sospet |
| 32 | Upap |
| 33 | Viva Bichos |
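The cleaning that turns the raw entries into the list above could look roughly like the sketch below. The role-word list and the function name are hypothetical; the notebook's actual rules are not shown, and this version only strips a role that trails the NGO name (an entry that starts with a role, such as "Colaboradora- ...", would need a separate rule).

```python
import re

# Hypothetical list of role words that respondents appended to NGO names.
ROLE_PATTERN = re.compile(
    r"[\s\-,.(]*\b(coordenadora|colaboradora|volunt[aá]ri[ao]|presidente|consultora|rt)\b.*$",
    re.IGNORECASE,
)


def clean_ngo_name(raw):
    """Drop a trailing role description, trim punctuation, and title-case."""
    name = ROLE_PATTERN.sub("", raw)
    return name.strip(" -,.()").title()


raw_names = [
    "ONG Sou Amigo - Coordenadora",
    "Adote um Gatinho ( voluntaria)",
    "ALPA. RT",
    "SOS ANIMAIS DE RUA, voluntária",
    "Amada",
]

# De-duplicate and alphabetize, as in the table above.
cleaned = sorted({clean_ngo_name(n) for n in raw_names})
print(cleaned)
```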
Success: Loaded unified CEUA name data for active respondents.
Total number of unique CEUAs found: 232
--- Generating Top 20 Unified CEUA Name Distribution Plot ---
Success: Loaded raw institution name data.
--- Generating Top 20 Cleaned Institution Name Distribution Plot ---
Success: Loaded 'Natureza_Inst' data for active respondents.
--- Descriptive Statistics for Institution Nature (Translated) ---
count     369
unique    4
top       Public
freq      249
Name: InstitutionNature, dtype: object
Frequency Count (Translated):
InstitutionNature
Public          249
Private         102
Commercial      17
Not Informed    1
Name: count, dtype: int64
--- Generating Institution Nature Distribution Plot (Translated) ---
Success: Loaded 'Representante_SPA' data for active respondents.
--- Descriptive Statistics for SPA Representative (Translated) ---
count     369
unique    4
top       Yes, only titular member
freq      137
Name: SPARepresentative, dtype: object
Frequency Count (Translated):
SPARepresentative
Yes, only titular member       137
Yes, titular and substitute    120
No SPA representative          76
Don't know                     36
Name: count, dtype: int64
--- Generating SPA Representative Distribution Plot (Translated) ---
Success: Loaded 'Quant_membros_CEUA' data for active respondents.
--- Descriptive Statistics for CEUA Member Count ---
count     369
unique    5
top       Between 10 and 14
freq      161
Name: MemberCount, dtype: object
Frequency Count:
MemberCount
Between 10 and 14    161
Between 5 and 9      88
Between 15 and 19    67
20 or more           36
Don't know           17
Name: count, dtype: int64
--- Generating CEUA Member Count Distribution Plot ---
Success: Loaded 'No_licencas_recusadas' data for active respondents.
--- Descriptive Statistics for Number of Licenses Refused ---
count     369
unique    13
top       Don't know
freq      135
Name: LicensesRefused, dtype: object
Frequency Count:
LicensesRefused
Don't know           135
0                    81
1                    23
Between 10 and 14    22
2                    21
3                    21
20 or more           20
4                    15
6                    14
5                    10
Between 15 and 19    4
7                    2
8                    1
Name: count, dtype: int64
--- Generating Licenses Refused Distribution Plot ---
Success: Loaded 'No_consultores_adhoc' data for active respondents.
--- Descriptive Statistics for Number of Ad-hoc Consultants Used ---
count     369
unique    14
top       0
freq      141
Name: AdHocConsultants, dtype: object
Frequency Count:
AdHocConsultants
0                    141
Don't know           116
2                    34
1                    23
20 or more           17
3                    15
5                    7
Between 10 and 14    4
8                    3
6                    2
4                    2
9                    2
Between 15 and 19    2
7                    1
Name: count, dtype: int64
--- Generating Ad-hoc Consultant Use Distribution Plot ---
Success: Loaded 'Freq_cursos' data for active respondents.
--- Descriptive Statistics for Frequency of Educational Courses ---
count     369
unique    5
top       Rarely
freq      177
Name: CourseFrequency, dtype: object
Frequency Count:
CourseFrequency
Rarely        177
Frequently    94
Never         64
Constantly    18
Don't know    16
Name: count, dtype: int64
--- Generating Course Frequency Distribution Plot ---
Success: Loaded 'Outros_encontros' data for active respondents.
--- Descriptive Statistics for Frequency of Other Meetings ---
count     369
unique    5
top       Rarely
freq      187
Name: OtherMeetingsFrequency, dtype: object
Frequency Count:
OtherMeetingsFrequency
Rarely        187
Never         98
Frequently    58
Constantly    16
Don't know    10
Name: count, dtype: int64
--- Generating Other Meetings Frequency Distribution Plot ---
Success: Loaded 'Metodo_decisao' data for active respondents.
--- Descriptive Statistics for Decision-Making Method ---
count     369
unique    12
top       Voting and Consensus
freq      296
Name: DecisionMethod, dtype: object
Frequency Count:
DecisionMethod
Voting and Consensus                                               296
Only Consensus                                                     44
Only Voting                                                        18
Don't know                                                         3
Votação com discussão prévia                                       1
Variavel                                                           1
O presidente                                                       1
SOU SUPLENTE                                                       1
O CEUA é novo. A primeira solicitação foi enviada nesta semana.    1
Votação, Avaliação de Projetos e Relatórios                        1
Consenso e votação                                                 1
Parecer externo e avaliação do parecer pelos membros titulares     1
Name: count, dtype: int64
--- Generating Decision-Making Method Distribution Plot ---
Success: Loaded 'Periodicidade_das_reunioes' data for active respondents.
--- Descriptive Statistics for Meeting Periodicity (Brittle Clean) ---
count     369
unique    8
top       Once a month
freq      231
Name: TranslatedPeriodicity, dtype: object
Frequency Count (Brittle Clean):
TranslatedPeriodicity
Once a month             231
Every fifteen days       39
Once every two months    38
Once per semester        24
Other                    16
Don't know               13
Once a week              5
Quarterly                3
Name: count, dtype: int64
--- Generating Meeting Periodicity Distribution Plot (Brittle Clean) ---
Success: Loaded 'SPA_assume_relatoria' data for active respondents.
--- Descriptive Statistics for SPA Protocol Reporting ---
count     369
unique    4
top       No
freq      121
Name: SPAReporting, dtype: object
Frequency Count:
SPAReporting
No                               121
Yes                              116
No SPA representative on CEUA    72
Don't know                       60
Name: count, dtype: int64
--- Generating SPA Protocol Reporting Distribution Plot ---
Success: Loaded opinion data on SPA reporting for active respondents.
--- Descriptive Statistics for Opinion on SPA Reporting ---
count     369
unique    3
top       Yes
freq      182
Name: OpinionSPAReporting, dtype: object
Frequency Count:
OpinionSPAReporting
Yes                  182
No                   127
I have no opinion    60
Name: count, dtype: int64
--- Generating Opinion on SPA Reporting Distribution Plot ---
Success: Loaded and joined data for Q24 and Q25.
--- Contingency Table (Counts) ---
| Justification \ Opinion | No | No Opinion | Yes |
|---|---|---|---|
| Based on equality/isonomy | 0 | 1 | 61 |
| Brings positive contributions | 0 | 1 | 35 |
| Conditional (with training/supervision) | 10 | 13 | 50 |
| Lack of engagement from SPA | 5 | 4 | 2 |
| No opinion / Not classifiable | 7 | 35 | 7 |
| Not the function of an SPA representative | 14 | 2 | 0 |
| Risk of partiality / Lacks competence | 86 | 1 | 2 |
| SPA is competent for the role | 5 | 3 | 25 |
--- Generating Visualization ---
Success: Loaded 'SPA_deve_ter_formacao' data for active respondents.
--- Descriptive Statistics for Opinion on SPA Higher Education ---
count     369
unique    3
top       Yes
freq      209
Name: SPAFormationNeeded, dtype: object
Frequency Count:
SPAFormationNeeded
Yes           209
No            128
No opinion    32
Name: count, dtype: int64
--- Generating Opinion Distribution Plot ---
Success: Loaded and joined justification data for active respondents.
--- Descriptive Statistics for Justification of SPA Higher Education ---
count     369
unique    6
top       Necessary for technical understanding
freq      203
Name: Justification, dtype: object
Frequency Count:
Justification
Necessary for technical understanding       203
Not necessary / Can be trained              107
No opinion / Not classifiable               21
Requirement could restrict participation    17
Relevant knowledge isn't from academia      12
To ensure objectivity / Avoid bias          9
Name: count, dtype: int64
--- Generating Justification Distribution Plot ---
Success: Loaded 'Papel_na_CEUA' data for active respondents.
--- Descriptive Statistics for CEUA Role ---
count     369
unique    6
top       Professor/Lecturer
freq      100
Name: CEUARole, dtype: object
Frequency Count:
CEUARole
Professor/Lecturer                100
Veterinarian                      95
Researcher                        67
Other                             48
Biologist                         47
Animal Protection Society Rep.    12
Name: count, dtype: int64
--- Generating CEUA Role Distribution Plot ---
Success: Loaded unified academic area data for active respondents.
--- Descriptive Statistics for Primary Academic Area (Cleaned & Aggregated) ---
count     369
unique    7
top       Veterinary Medicine
freq      134
Name: 0, dtype: object
Frequency Count (Aggregated for Plot):
0
Veterinary Medicine                               134
Biological Sciences                               82
Health Sciences                                   50
Agrarian Sciences (except Veterinary Medicine)    48
Pharmacy / Chemistry                              26
Humanities                                        18
Other                                             11
Name: count, dtype: int64
--- Generating Primary Academic Background Distribution Plot ---
Success: Loaded education level data for active respondents.
--- Descriptive Statistics for Education Level (Revised Categories) ---
count     369
unique    5
top       Doctorate (PhD)
freq      171
Name: EducationLevel, dtype: object
Frequency Count (Revised Categories):
EducationLevel
Doctorate (PhD)                  171
Post-doctorate / Habilitation    96
Master's Degree                  66
Undergraduate Degree             33
No University Degree             3
Name: count, dtype: int64
--- Generating Education Level Distribution Plot (Revised Categories) ---
Success: Loaded data for animal use areas.
--- Frequency Count of All Mentioned Animal Use Areas (Corrected) ---
AnimalUseAreas
Basic research                   161
Teaching                         148
Applied research                 121
Agricultural research            91
I do not use animals             59
Disease diagnosis                58
Drug testing                     47
Toxicological testing            45
Vaccines / Immunology            34
Other/NA                         33
Maintenance of microorganisms    15
Maintenance of invertebrates     15
Genetic engineering              4
Military research                1
Cosmetic testing                 1
Name: count, dtype: int64
--- Generating Distribution Plot (Corrected) ---
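Because this question accepts several answers per respondent, the counts above add up to more than the 369 respondents. A minimal sketch of how such multi-select cells can be split and tallied follows; the sample cells and the semicolon separator are assumptions, since the raw format of the list column is not shown.

```python
from collections import Counter

# Hypothetical raw cells; the real separator in the spreadsheet may differ.
raw_answers = [
    "Basic research; Teaching",
    "Teaching",
    "I do not use animals",
    "Basic research; Applied research; Teaching",
    "",
]


def split_choices(cell, sep=";"):
    """Split one multi-select cell into its trimmed, non-empty choices."""
    return [part.strip() for part in cell.split(sep) if part.strip()]


# Every mention counts once, so the total can exceed the number of respondents.
area_counts = Counter(
    choice for cell in raw_answers for choice in split_choices(cell)
)
print(area_counts.most_common())
```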
Success: Loaded and joined data for administrative functions.
--- Frequency Count of Administrative Functions ---
AdminFunction_Translated
No administrative function    263
Coordinator                   70
Vice-Coordinator              29
Secretary                     7
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Time on CEUA.
--- Descriptive Statistics for Time on CEUA (in Years) ---
count    369.000000
mean     4.157859
std      3.562950
min      0.250000
25%      1.750000
50%      2.250000
75%      6.000000
max      22.000000
Name: TimeOnCEUA, dtype: float64
--- Generating Distribution Plot ---
Success: Loaded data for Formal Ethics Knowledge.
--- Descriptive Statistics for Formal Ethics Knowledge ---
count    369.000000
mean     3.859079
std      0.891799
min      1.000000
25%      3.000000
50%      4.000000
75%      4.000000
max      5.000000
Name: EthicsKnowledge, dtype: float64
--- Frequency Count ---
EthicsKnowledge
1    6
2    15
3    95
4    162
5    91
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Ethics Education Quality.
--- Descriptive Statistics for Ethics Education Quality ---
count    369.000000
mean     2.728997
std      1.385974
min      0.000000
25%      2.000000
50%      3.000000
75%      4.000000
max      5.000000
Name: EthicsEducationScale, dtype: float64
--- Frequency Count ---
EthicsEducationScale
0    14
1    67
2    84
3    92
4    64
5    48
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Animal Welfare Education Quality.
--- Descriptive Statistics for Animal Welfare Education Quality ---
count    369.000000
mean     2.531165
std      1.615055
min      0.000000
25%      1.000000
50%      3.000000
75%      4.000000
max      5.000000
Name: WelfareEducationScale, dtype: float64
--- Frequency Count ---
WelfareEducationScale
0    46
1    73
2    60
3    71
4    67
5    52
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for 'Disciplinas Cursadas' (Q38).
--- Frequency Counts by Course and Academic Level (Corrected) ---
| Course \ Academic_Level | Doctorate | Extension Course | Master's | Specialization | Undergraduate |
|---|---|---|---|---|---|
| Animal Ethics | 44 | 86 | 47 | 19 | 64 |
| Animal Rights | 20 | 71 | 19 | 19 | 30 |
| Animal Welfare | 69 | 127 | 72 | 36 | 103 |
| Bioethics | 53 | 74 | 69 | 29 | 125 |
| Ethology | 34 | 63 | 33 | 15 | 110 |
| Moral Philosophy | 14 | 29 | 13 | 7 | 73 |
| Research Ethics | 100 | 82 | 122 | 33 | 74 |
--- Generating Visualization (Corrected) ---
Success: Loaded data for Ethical Aptitude.
--- Descriptive Statistics for Ethical Aptitude ---
count    369.000000
mean     4.027100
std      0.762093
min      1.000000
25%      4.000000
50%      4.000000
75%      5.000000
max      5.000000
Name: EthicalAptitude, dtype: float64
--- Frequency Count ---
EthicalAptitude
1    3
2    7
3    63
4    200
5    96
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Animal Welfare Aptitude.
--- Descriptive Statistics for Animal Welfare Aptitude ---
count    369.000000
mean     4.073171
std      0.861344
min      0.000000
25%      4.000000
50%      4.000000
75%      5.000000
max      5.000000
Name: WelfareAptitude, dtype: float64
--- Frequency Count ---
WelfareAptitude
0    3
1    3
2    7
3    55
4    184
5    117
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Protocol Comprehension Difficulty.
--- Descriptive Statistics for Protocol Difficulty ---
count     369
unique    5
top       Rarely
freq      264
Name: ProtocolDifficulty_Translated, dtype: object
--- Frequency Count ---
ProtocolDifficulty_Translated
Rarely        264
Frequently    61
Never         31
Constantly    7
Don't Know    6
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Necessity of Animal Models.
--- Descriptive Statistics for Necessity of Animal Models ---
count    369.000000
mean     3.747967
std      1.230899
min      0.000000
25%      3.000000
50%      4.000000
75%      5.000000
max      5.000000
Name: NecessityScale, dtype: float64
--- Frequency Count ---
NecessityScale
0    4
1    17
2    39
3    78
4    101
5    130
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Necessity of Animals for Food.
--- Descriptive Statistics for Necessity of Animals for Food ---
count    369.000000
mean     3.598916
std      1.445232
min      0.000000
25%      3.000000
50%      4.000000
75%      5.000000
max      5.000000
Name: NecessityScale, dtype: float64
--- Frequency Count ---
NecessityScale
0    16
1    26
2    29
3    85
4    76
5    137
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for 'Necessary Evil' opinion.
--- Descriptive Statistics for 'Necessary Evil' Opinion ---
count     369
unique    3
top       No
freq      168
Name: Opinion_Translated, dtype: object
--- Frequency Count ---
Opinion_Translated
No            168
Yes           166
Don't Know    35
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded and joined data for 'Necessary Evil' justifications.
--- Frequency Count of Justifications ---
Justification_Translated
Disagrees that it is an evil               91
Limitations of alternatives                66
Advances in science and benefits           58
Confidence in alternative methods          41
Conditional (necessary in some cases)      33
Critique of experimentation                19
Confidence in ethics and responsibility    16
No opinion / Not classifiable              13
Highlights harm to the animal              2
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Perceived Suffering in Experiments.
--- Descriptive Statistics for Perceived Suffering Scale ---
count    369.000000
mean     2.203252
std      1.580929
min      0.000000
25%      1.000000
50%      2.000000
75%      3.000000
max      5.000000
Name: SufferingScale, dtype: float64
--- Frequency Count ---
SufferingScale
0    59
1    91
2    57
3    80
4    42
5    40
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded and joined data for Harm-Benefit evaluation criteria.
--- Frequency Count of Harm-Benefit Criteria ---
Criteria_Translated
Relevance and Benefits      84
Refinement (3Rs)            60
Knowledge of Evaluators     51
Quality of Research         50
Not classifiable            39
Evaluation Process          36
Ethical Principles          23
Replacement (3Rs)           13
The 3Rs (in conjunction)    11
Reduction (3Rs)             2
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for acceptable animal uses.
--- Frequency Count of Acceptable Animal Use Categories ---
AcceptableUses
Food production / Agriculture                          295
Applied research                                       289
Agricultural research                                  239
Basic research                                         239
Teaching                                               206
Toxicity testing                                       191
Regulatory testing of food products                    156
Other                                                  151
Regulatory testing of substance & material exposure    140
Zoo                                                    115
Sports                                                 59
Cosmetic & personal hygiene product testing            49
Clothing                                               35
Military research                                      28
I do not consider any use of animals acceptable        13
I have no opinion on this                              4
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Perceived Suffering Intensity.
--- Descriptive Statistics for Perceived Suffering Intensity ---
count    369.000000
mean     1.815718
std      1.199489
min      0.000000
25%      1.000000
50%      2.000000
75%      3.000000
max      5.000000
Name: SufferingIntensity, dtype: float64
--- Frequency Count ---
SufferingIntensity
0    45
1    122
2    97
3    75
4    21
5    9
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Vegan/Vegetarian status.
--- Descriptive Statistics for Vegan/Vegetarian Status ---
count     369
unique    2
top       No
freq      325
Name: IsVegan_Translated, dtype: object
--- Frequency Count ---
IsVegan_Translated
No     325
Yes    44
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Animal Role Terminology.
--- Frequency Count of Animal Role Terminology ---
Term_Translated
Participants of research    138
Subjects of research        124
Objects of research         44
Don't know                  25
Research inputs             20
Other                       18
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Comfort to Express Position.
--- Descriptive Statistics for Comfort to Express Position ---
count    369.000000
mean     4.604336
std      0.766544
min      1.000000
25%      4.000000
50%      5.000000
75%      5.000000
max      5.000000
Name: ComfortScale, dtype: float64
--- Frequency Count ---
ComfortScale
1    3
2    6
3    28
4    60
5    272
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Perceived Discrimination.
--- Descriptive Statistics for Perceived Discrimination ---
count     369
unique    5
top       Never
freq      287
Name: Frequency_Translated, dtype: object
--- Frequency Count ---
Frequency_Translated
Never         287
Rarely        56
Don't Know    14
Constantly    7
Frequently    5
Name: count, dtype: int64
--- Generating Distribution Plot ---
Success: Loaded data for Perceived Respect.
--- Descriptive Statistics for Perceived Respect ---
count     369
unique    5
top       Constantly
freq      289
Name: Frequency_Translated, dtype: object
--- Frequency Count ---
Frequency_Translated
Constantly    289
Frequently    60
Don't Know    11
Rarely        6
Never         3
Name: count, dtype: int64
--- Generating Distribution Plot ---
4. Correlations¶
4.1 Correlation Matrix¶
Success: Loaded raw data from database.
Preprocessing complete. Matrix will be built with 369 complete rows.
================================================================================
### Data Dictionary & Methodology for Correlation Matrix ###
================================================================================
--- I. Variable Definitions and Preprocessing ---
- Stance_Critical:
Target Variable. Derived from 'Justifique...codificado(45)'. Binary. Codes
2, 3, 9 ('Critical') mapped to 1; others mapped to 0 ('Favorable').
- Is_Vegan:
Derived from '50. Vegano_ou_vegetariano_binario'. Binary. 'Sim' mapped to 1,
otherwise 0.
- Is_Animal_User:
Derived from '31. Area_em_que_usa_animais_lista'. Binary. Presence of text
indicating animal use mapped to 1, 'não uso' or empty mapped to 0.
- NGO_Affiliation:
Derived from '9. Vinculo'. Binary. 'Sim' mapped to 1, otherwise 0.
- Age:
Derived from '3. Idade'. Ordinal Text. Mapped ranges 'Até 30 anos' through
'Mais de 70 anos' to a numeric scale of 0 to 5.
- Education:
Derived from '30. Escolaridade'. Ordinal Text. Mapped 'Graduação' through
'Pós doutorado' to a numeric scale of 0 to 4.
- Role_Code:
Derived from 'Papel_na_CEUA_codificado(28)'. Coded Categorical (Nominal).
Used directly. Note: Correlation with nominal codes should be interpreted
with caution.
- Admin_Function_Code:
Derived from 'Funcao_administrativa_na_CEUA_codificado(32)'. Coded
Categorical (Nominal). Used directly.
- Time_on_CEUA:
Derived from 'tempo_de_CEUA_em_anos_avg(33,34)'. Continuous. Used directly
after converting to numeric.
- Knowledge_Ethics:
Derived from '35. Conhecimento_formal_em_etica_escala'. Ordinal Scale (1-5).
Used directly.
- Aptitude_Ethics:
Derived from '39. Aptidao_para_avaliacoes_eticas_escala'. Ordinal Scale
(1-5). Used directly.
- Aptitude_Welfare:
Derived from '40. Aptidao_para_avaliacoes_de_bem-estar_animal_escala'.
Ordinal Scale (1-5). Used directly.
- Need_Animal_Models:
Derived from '42. Necessidade_de_uso_de_modelos_animais_escala'. Ordinal
Scale (1-5). Used directly.
- Need_Animals_Food:
Derived from '43. Necessidade_de_uso_de_animais_na_alimentacao_escala'.
Ordinal Scale (1-5). Used directly.
- Suffering_Implied:
Derived from '46. Experimentos_cientificos_implicam_sofrimento_animal_escala'.
Ordinal Scale (1-5). Used directly.
- Willingness_to_Speak:
Derived from '53. Vontade_para_manifestar_posicionamento'. Mixed-Type
Ordinal. Processed to a numeric scale 0-3.
- Feeling_Discriminated:
Derived from '54. Sente-se_discriminado_nas_reunioes'. Mixed-Type Ordinal.
Processed to a numeric scale 0-3.
- Opinions_Respected:
Derived from '55. Suas_colocacoes_sao_respeitadas'. Mixed-Type Ordinal.
Processed to a numeric scale 0-3.
- Resistance_to_Proposals:
Derived from '56. Ha_resistencia_as_suas_propostas'. Mixed-Type Ordinal.
Processed to a numeric scale 0-3.
- Peers_Same_Concern:
Derived from '57. Demais_membros_atribuem_o_mesmo_nivel_de_preocupacao'.
Mixed-Type Ordinal. Processed to a numeric scale 0-3.
- Users_Minimize_Suffering:
Derived from '62. Membros_que_pesquisam...minimizam_o_sofrimento...'.
Mixed-Type Ordinal. Processed to a numeric scale 0-3.
- CEUA_Uses_3Rs:
Derived from '63. O_quanto_a_CEUA_se_baseia_no_principio_dos_3Rs_escala'.
Ordinal Scale (1-5). Used directly.
--- II. Methodology ---
1. Data Selection: 369 complete rows from active respondents (Ativo=1) were used for this analysis.
2. Correlation Method: Spearman's Rank Correlation (ρ) was computed for all pairs of variables. This non-parametric method was chosen as it is suitable for measuring monotonic relationships between ordinal variables, which constitute the majority of the data, without assuming linearity.
3. Handling Missing Data: After initial processing, any remaining missing values (NaNs), primarily from 'Não sei dizer' responses or conversion errors, were imputed using the median of their respective columns. This strategy preserves the full sample size but may introduce a conservative bias by slightly reducing variance.
4. Visualization: The resulting correlation matrix is displayed as a heatmap. A diverging colormap ('vlag') centered at zero is used to clearly distinguish positive (red) from negative (blue) associations. The upper triangle is masked to reduce redundancy.
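The variance-reducing effect of median imputation mentioned in step 3 can be seen on a toy column. This is a minimal sketch using only the standard library; the scores below are hypothetical and `impute_median` is an illustrative helper, not the notebook's own function.

```python
from statistics import median, pvariance

def impute_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical 1-5 scale answers; None stands for 'Não sei dizer' or a failed conversion.
scores = [1, 2, None, 3, 3, None, 4, 5, None, 5]

observed = [v for v in scores if v is not None]
imputed = impute_median(scores)

# The sample size is preserved, but every filled value sits at the centre,
# so the spread of the column shrinks.
print(len(observed), round(pvariance(observed), 3))
print(len(imputed), round(pvariance(imputed), 3))
```

Because the filled values carry no information about the other variables, correlations computed on heavily imputed columns tend to be pulled toward zero, which is the conservative bias the methodology notes.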
================================================================================
Success: Loaded data for analysis.
--- Contingency Table (Observed Frequencies) ---
OpinionNecessaryEvil      Não  Não sei dizer  Sim
StanceOnExperimentation
Non-user                   29              8   25
User                      139             27  141
==================================================
### CHI-SQUARE TEST RESULTS ###
==================================================
[1] HYPOTHESIS:
H₀ (Null Hypothesis): There is NO association between using animals in work and the opinion on animal experimentation being a 'necessary evil'.
H₁ (Alternative Hypothesis): Whether an individual uses animals in their work is associated with their opinion on animal experimentation being a 'necessary evil'.
[2] P-VALUE:
The calculated p-value is: 0.5211
[3] CONCLUSION:
Since the p-value (0.5211) is greater than our significance level (0.05), we FAIL TO REJECT the null hypothesis. This means we do not have sufficient evidence from our data to conclude that there is an association between the two variables.
[4] EFFECT SIZE:
Cramer's V: 0.0594
Cramer's V measures the strength of the association (0=none, 1=perfect). For this 2x3 table (chi-square df=2; the smaller table dimension minus one is 1), a value around 0.1 is small, 0.3 is medium, and 0.5 is large.
==================================================
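The p-value and Cramer's V reported above can be re-derived from the printed contingency table. The following is a minimal sketch in plain Python; the helper names are illustrative, and the closed-form p-value `exp(-χ²/2)` is valid only for df=2, so it is not a general replacement for a statistics library.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and total n for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n

def cramers_v(stat, n, n_rows, n_cols):
    """Cramer's V, scaled by the smaller table dimension minus one."""
    return math.sqrt(stat / (n * min(n_rows - 1, n_cols - 1)))

# Rows: Non-user, User. Columns: Não, Não sei dizer, Sim.
observed = [[29, 8, 25], [139, 27, 141]]

stat, n = chi_square(observed)
p_value = math.exp(-stat / 2)   # chi-square survival function, df=2 only
v = cramers_v(stat, n, n_rows=2, n_cols=3)

print(f"chi2={stat:.4f}  p={p_value:.4f}  V={v:.4f}")
```

The scaling term `min(n_rows - 1, n_cols - 1)` is what the small/medium/large guide depends on, which is why a 2x3 table uses the same 0.1/0.3/0.5 benchmarks as a 2x2 table.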
Success: Loaded data for analysis.
Data Cleaning: Retained 234 of 369 respondents with complete data.
======================================================================
### MANN-WHITNEY U TEST: SPA PRESENCE VS. AD-HOC USE ###
======================================================================
This analysis investigates whether the frequency of using ad-hoc consultants differs between CEUAs with an SPA representative and those without. The Mann-Whitney U test, a non-parametric method, was chosen to compare the distributions of an ordinal variable between these two independent groups.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀: The distributions of ad-hoc consultant usage are IDENTICAL for both groups.
H₁: The distributions of ad-hoc consultant usage are DIFFERENT for the two groups.
----------------------------------------------------------------------
[2] Descriptive Statistics
----------------------------------------------------------------------
Median Ad-Hoc Use (SPA Rep. Exists): 0.00 (n=176)
Median Ad-Hoc Use (No SPA Rep.): 1.00 (n=58)
----------------------------------------------------------------------
[3] Statistical Results
----------------------------------------------------------------------
Mann-Whitney U statistic: 4368.5
P-value: 0.0709
----------------------------------------------------------------------
[4] Analytical Conclusion
======================================================================
The p-value (0.0709) is not less than our significance level of 0.05. Therefore, we FAIL to reject the Null Hypothesis. We do not have sufficient statistical evidence to conclude that a difference exists between the two groups.
======================================================================
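For readers unfamiliar with how the U statistic is built, it can be computed by comparing every observation in one group with every observation in the other. The sketch below uses hypothetical ad-hoc usage codes, not the survey data, and the function name is illustrative.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b): pairwise wins for each group, with ties counted as half."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b

# Hypothetical ordinal codes for ad-hoc consultant usage.
with_rep = [0, 0, 1, 0, 2, 1]
without_rep = [1, 2, 1, 3, 0]

u_a, u_b = mann_whitney_u(with_rep, without_rep)
print(u_a, u_b)  # the two values always sum to len(with_rep) * len(without_rep)
```

Which of the two U values a library reports is a convention, so the group order matters whenever a signed effect size is later derived from U.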
Success: Loaded data for analysis.
--- Contingency Table (Observed Frequencies) ---
This table shows the number of respondents in each categorized stance.
JustificationBucket      Critical  Favorable
StanceOnExperimentation
Non-user                       21         37
User                           41        257
--- Generating Visualization: Grouped Bar Chart ---
==================================================
### CHI-SQUARE TEST RESULTS ###
==================================================
[1] HYPOTHESIS:
H₀ (Null Hypothesis): There is NO association between using animals in work and their justification stance on animal experimentation.
H₁ (Alternative Hypothesis): Whether an individual uses animals in their work is associated with their justification stance (Favorable vs. Critical) on animal experimentation.
[2] P-VALUE:
The calculated p-value is: 0.0001
[3] CONCLUSION:
Since the p-value (0.0001) is less than our significance level (0.05), we REJECT the null hypothesis. This indicates a statistically significant association between a person's status as an animal user and their justification stance.
[4] EFFECT SIZE:
Cramer's V: 0.2086
Cramer's V measures the strength of the association (from 0 to 1). Typical interpretations for this test's degrees of freedom (df=1) are: ~0.1 (small), ~0.3 (medium), ~0.5 (large effect).
==================================================
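For a 2x2 table the result depends on whether Yates' continuity correction is applied. The sketch below recomputes the statistic both ways from the printed counts; it is an illustration in plain Python, and the helper name is an assumption. The printed Cramer's V (0.2086) matches the corrected statistic, which is consistent with `scipy.stats.chi2_contingency` and its default `correction=True`, although the notebook's exact call is not shown here.

```python
import math

def chi_square_2x2(table, yates=True):
    """Chi-square statistic, p-value (df=1) and Cramer's V for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Move each observed count 0.5 toward its expected value.
                diff = max(0.0, diff - 0.5)
            stat += diff ** 2 / expected
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square survival function, df=1
    v = math.sqrt(stat / n)                   # min(rows, cols) - 1 == 1 for a 2x2 table
    return stat, p_value, v

# Rows: Non-user, User. Columns: Critical, Favorable.
observed = [[21, 37], [41, 257]]

stat_c, p_c, v_c = chi_square_2x2(observed, yates=True)
stat_u, p_u, v_u = chi_square_2x2(observed, yates=False)
print(f"corrected:   chi2={stat_c:.3f}  p={p_c:.5f}  V={v_c:.4f}")
print(f"uncorrected: chi2={stat_u:.3f}  p={p_u:.5f}  V={v_u:.4f}")
```

Either way the association is significant; the correction only makes the estimate slightly more conservative.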
Success: Loaded data for analysis.
--- Contingency Table (Observed Frequencies) ---
JustificationBucket  Critical  Favorable
EthicsKnowledge
1                           1          5
2                           1         14
3                          19         72
4                          27        126
5                          14         77
==================================================
### CHI-SQUARE TEST RESULTS ###
==================================================
[1] HYPOTHESIS:
H₀ (Null Hypothesis): There is NO association between ethics knowledge and justification stance.
H₁ (Alternative Hypothesis): A respondent's self-assessed formal knowledge in ethics is associated with their justification stance (Favorable vs. Critical) on animal experimentation.
[2] P-VALUE:
The calculated p-value is: 0.6930
[3] CONCLUSION:
Since the p-value (0.6930) is greater than our significance level (0.05), we FAIL TO REJECT the null hypothesis. This means we do not have sufficient evidence from our data to conclude that there is an association between the two variables.
[4] EFFECT SIZE:
Cramer's V: 0.0792
Cramer's V measures the strength of the association (from 0 to 1). For this 5x2 table (chi-square df=4; the smaller table dimension minus one is 1), interpretations can be guided by: ~0.1 (small), ~0.3 (medium), ~0.5 (large effect).
==================================================
Success: Loaded data for analysis. Sample sizes: Vegan/Vegetarian (n=44) Non-Vegan/Vegetarian (n=325)
============================================================
### MANN-WHITNEY U TEST: DETAILED ANALYTICAL REPORT ###
============================================================
The Mann-Whitney U test is a non-parametric test used to determine if there is a significant difference between two independent groups on an ordinal or continuous variable. We chose this test because our 'SufferingScale' data is ordinal (ranked) and not assumed to be normally distributed, making a standard t-test inappropriate.
------------------------------------------------------------
[1] Stating the Hypotheses
------------------------------------------------------------
We formally state our research question as a testable pair of hypotheses. The Null Hypothesis (H₀) is the default assumption of no difference, while the Alternative Hypothesis (H₁) is what we are testing for.
H₀: The distributions of perceived suffering scores are IDENTICAL for both groups.
H₁: The distributions of perceived suffering scores are DIFFERENT for the two groups.
------------------------------------------------------------
[2] Descriptive Statistics: A First Look at the Data
------------------------------------------------------------
Before testing our hypothesis, we examine the central tendency of each group. Since the data is ordinal, the **median** (the middle value) is the most appropriate measure. It splits the group in half: at least 50% of respondents answered at or below it and at least 50% at or above it.
Median Score (Vegan/Vegetarian Group): 3.00
Median Score (Non-Vegan/Vegetarian Group): 2.00
Observation: The median for the vegan/vegetarian group is higher. The next step is to determine if this observed difference is statistically significant or likely due to random chance.
------------------------------------------------------------
[3] Inferential Statistics: Testing for Significance
------------------------------------------------------------
This is the core of the test. The **p-value** represents the probability of observing a difference this large (or larger) between our groups purely by chance, assuming H₀ is true. A small p-value (typically < 0.05) suggests the observed difference is unlikely to be explained by chance alone.
Mann-Whitney U statistic: 9811.0
Calculated p-value: < 0.0001 (prints as 0.0000 at four decimal places)
------------------------------------------------------------
[4] Effect Size: Measuring the Magnitude of the Difference
------------------------------------------------------------
A small p-value tells us the difference is significant, but not how *large* it is. For this, we calculate the **Rank-Biserial Correlation (r)**. This value ranges from -1 to 1 and measures the strength of the difference between the groups.
Rank-Biserial Correlation (r): 0.3722
Interpretation Guide: |0.1| is a small effect, |0.3| is a medium effect, and |0.5| is a large effect. The positive sign indicates the first group (Vegan/Vegetarian) tends to have higher scores.
------------------------------------------------------------
[5] Analytical Conclusion
------------------------------------------------------------
The p-value (< 0.0001) is less than our significance level of 0.05. Therefore, we **reject the Null Hypothesis**. There is strong statistical evidence to conclude that a significant difference exists in the distribution of perceived suffering scores between vegans/vegetarians and non-vegans/vegetarians, with the vegan/vegetarian group tending to report higher scores. The effect size (r=0.37) indicates that the magnitude of this difference is medium.
============================================================
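The rank-biserial correlation in this report follows directly from the U statistic and the two group sizes printed earlier (n=44 and n=325). A minimal sketch, assuming U refers to the first (Vegan/Vegetarian) group; the function name is illustrative.

```python
def rank_biserial(u_first, n_first, n_second):
    """Rank-biserial correlation from the U statistic of the first group."""
    return 2.0 * u_first / (n_first * n_second) - 1.0

u_stat, n_vegan, n_non_vegan = 9811.0, 44, 325

r = rank_biserial(u_stat, n_vegan, n_non_vegan)
# Share of (vegan, non-vegan) pairs in which the vegan respondent scores higher,
# with ties counted as half a pair.
superiority = u_stat / (n_vegan * n_non_vegan)

print(round(r, 4))            # → 0.3722
print(round(superiority, 3))  # → 0.686
```

Read this way, r = 0.37 means that in roughly 69% of cross-group pairs the vegan/vegetarian respondent reports the higher suffering score.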
Success: Loaded data for animal use and suffering perception.
--- Descriptive Statistics by Group ---
             count      mean       std  min  25%  50%  75%  max
UsesAnimals
Non-User      54.0  3.074074  1.502851  0.0  2.0  3.0  4.0  5.0
User         315.0  2.053968  1.547633  0.0  1.0  2.0  3.0  5.0
--- Generating Visualization ---
--- Chi-Squared Test of Independence ---
Contingency Table:
SufferingScale   0   1   2   3   4   5
UsesAnimals
Non-User         3   6  10  12  11  12
User            56  85  47  68  31  28
Chi-Squared Statistic (χ²): 21.4611
P-value: 0.0007
Degrees of Freedom: 5
--- Interpretation of Results ---
Significance level (α): 0.05
Null Hypothesis (H₀): There is no association between a respondent's status as an animal user and their level of agreement that scientific experiments imply suffering.
Conclusion: Since the p-value (0.0007) is less than 0.05, we reject the null hypothesis. There is a statistically significant association between using animals professionally and the perception of animal suffering in experiments.
Assumption Check: All expected cell frequencies are 5 or greater. The test is considered reliable.
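The assumption check can be reproduced from the observed counts alone. The sketch below computes expected frequencies in plain Python and lists any cell below 5; the helper names are illustrative. It also runs the same check on the ethics-knowledge table from the earlier chi-square test, whose report prints no assumption check.

```python
def expected_counts(table):
    """Expected frequencies under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def cells_below(table, threshold=5.0):
    """Expected frequencies that fall under the usual rule-of-thumb threshold."""
    return [e for row in expected_counts(table) for e in row if e < threshold]

# Rows: Non-User, User. Columns: SufferingScale 0..5.
suffering = [[3, 6, 10, 12, 11, 12], [56, 85, 47, 68, 31, 28]]
# Rows: EthicsKnowledge 1..5. Columns: Critical, Favorable.
knowledge = [[1, 5], [1, 14], [19, 72], [27, 126], [14, 77]]

smallest = min(min(row) for row in expected_counts(suffering))
print(round(smallest, 3), cells_below(suffering))
print([round(e, 3) for e in cells_below(knowledge)])
```

The suffering table passes, in agreement with the report. The ethics-knowledge table has three of its ten expected frequencies below 5, so that test's p-value deserves extra caution; merging the two lowest knowledge levels or using an exact test are common remedies.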
======================================================================
### SPEARMAN'S RANK CORRELATION: DETAILED ANALYTICAL REPORT ###
======================================================================
Spearman's rank correlation (rho, ρ) is a non-parametric test that measures the strength and direction of a monotonic relationship between two ranked or ordinal variables. Unlike Pearson's correlation, it does not assume a linear relationship, making it ideal here.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀ (Null Hypothesis): There is NO monotonic association between a member's time on a CEUA and their self-perceived ethical aptitude.
H₁ (Alternative Hypothesis): There IS a monotonic association between the two variables.
----------------------------------------------------------------------
[2] Inferential Statistics & Effect Size
----------------------------------------------------------------------
The correlation coefficient (rho) is itself a measure of effect size.
Spearman's rho (ρ): 0.1122
Calculated p-value: 0.0312
Interpretation Guide for |ρ|:
• 0.00 - 0.30: Weak correlation
• 0.30 - 0.60: Moderate correlation
• 0.60 - 1.00: Strong correlation
The sign of ρ indicates the direction (positive or negative).
----------------------------------------------------------------------
[3] Analytical Conclusion
----------------------------------------------------------------------
The p-value (0.0312) is less than our significance level of 0.05. Therefore, we REJECT the Null Hypothesis. There is a statistically significant, although weak, positive monotonic relationship between time served on a CEUA and self-perceived ethical aptitude. As experience increases, aptitude tends to increase as well.
======================================================================
======================================================================
### VALIDITY CHECKS FOR SPEARMAN CORRELATION ###
======================================================================
[1] Sensitivity Analysis (excluding TimeOnCEUA > 15 years)
----------------------------------------------------------------------
Original sample size: 369
Filtered sample size: 364
Original rho: 0.1122 (p-value: 0.0312)
Filtered rho: 0.1030 (p-value: 0.0496)
Conclusion: The original and filtered estimates are similar in size and direction, so the five excluded respondents did not drive the result. The filtered p-value (0.0496) sits only just below 0.05, however, so the evidence is borderline.
[2] Bootstrapping (1000 iterations)
----------------------------------------------------------------------
95% Confidence Interval for rho: [0.0008, 0.2101]
Conclusion: A confidence interval gives the range of correlation values compatible with the data. This interval excludes 0, but only barely (lower bound 0.0008), which agrees with a weak and marginally significant positive relationship.
======================================================================
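A percentile bootstrap for Spearman's rho can be sketched with the standard library alone. The data below are simulated and hypothetical (200 synthetic respondents), and the helper names, seed, and percentile method are assumptions; the notebook's own bootstrap may differ in its details.

```python
import random

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one variable is constant
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def bootstrap_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval from resampling respondents with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rho = spearman([x[i] for i in idx], [y[i] for i in idx])
        if rho == rho:  # drop NaN results from degenerate resamples
            estimates.append(rho)
    estimates.sort()
    lower = estimates[int(alpha / 2 * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper

# Hypothetical data: years on a committee and a loosely related 1-5 aptitude score.
rng = random.Random(42)
years = [rng.randint(0, 20) for _ in range(200)]
aptitude = [min(5, max(1, round(3 + 0.05 * y + rng.gauss(0, 1)))) for y in years]

ci_low, ci_high = bootstrap_ci(years, aptitude)
print(round(spearman(years, aptitude), 4), round(ci_low, 4), round(ci_high, 4))
```

Resampling whole respondents keeps each pair of answers together, which is what allows the interval to reflect sampling uncertainty in the correlation itself.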
Success: Loaded data for ethics education and ethical aptitude. Data Cleaning: Retained 369 of 369 respondents with complete data. --- Generating Visualization ---
======================================================================
### SPEARMAN'S RANK CORRELATION: DETAILED ANALYTICAL REPORT ###
======================================================================
This analysis investigates the relationship between two ordinal variables: the perceived quality of a respondent's ethics education and their self-assessed aptitude for conducting ethical evaluations. Spearman's rank correlation (rho) is the appropriate non-parametric test to measure the strength and direction of a monotonic relationship between such ranked data.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀ (Null Hypothesis): There is NO monotonic correlation between the variables.
H₁ (Alternative Hypothesis): There IS a monotonic correlation between the variables.
----------------------------------------------------------------------
[2] Statistical Results
----------------------------------------------------------------------
Spearman's Correlation Coefficient (rho): 0.3517
P-value: < 0.0001 (prints as 0.0000 at four decimal places)
Sample Size (n): 369
----------------------------------------------------------------------
[3] Interpretation
----------------------------------------------------------------------
The p-value (< 0.0001) is less than our significance level of 0.05. Therefore, we REJECT the Null Hypothesis. The observed correlation is statistically significant and is unlikely to be due to random chance. The correlation coefficient (rho = 0.35) indicates a moderate positive monotonic relationship (|ρ| between 0.30 and 0.60 on the interpretation guide used in this notebook). This means that as the perceived quality of ethics education increases, the self-assessed aptitude for ethical evaluations also tends to increase.
----------------------------------------------------------------------
[4] Final Conclusion
======================================================================
There is a statistically significant, positive relationship between how well respondents feel they were taught ethics and how prepared they feel to make ethical evaluations. Those who report a more sufficient educational background in ethics also tend to report a higher aptitude for the task. The strength of this relationship is moderate.
======================================================================
Success: Loaded data for animal welfare education and aptitude. Data Cleaning: Retained 369 of 369 respondents with complete data. --- Generating Visualization 1: Jittered Scatterplot ---
======================================================================
### SPEARMAN'S RANK CORRELATION: ANIMAL WELFARE REPORT ###
======================================================================
This analysis investigates the relationship between the perceived quality of a respondent's animal welfare education and their self-assessed aptitude for conducting animal welfare evaluations. Spearman's rank correlation (rho) is the appropriate test for this pair of ordinal variables.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀ (Null Hypothesis): There is NO monotonic correlation between the variables.
H₁ (Alternative Hypothesis): There IS a monotonic correlation between the variables.
----------------------------------------------------------------------
[2] Statistical Results
----------------------------------------------------------------------
Spearman's Correlation Coefficient (rho): 0.2406
P-value: < 0.0001 (prints as 0.0000 at four decimal places)
Sample Size (n): 369
----------------------------------------------------------------------
[3] Interpretation
----------------------------------------------------------------------
The p-value (< 0.0001) is less than our significance level of 0.05. Therefore, we REJECT the Null Hypothesis. The observed correlation is statistically significant. The correlation coefficient (rho = 0.24) indicates a weak positive monotonic relationship. This means that as the perceived quality of animal welfare education increases, the self-assessed aptitude for animal welfare evaluations also tends to increase.
----------------------------------------------------------------------
[4] Final Conclusion
======================================================================
There is a statistically significant, positive relationship between how well respondents feel they were taught animal welfare and how prepared they feel to make such evaluations. The strength of this relationship is weak.
======================================================================
Success: Loaded data for ethics knowledge and protocol difficulty. Data Cleaning: Retained 363 of 369 respondents with complete data. --- Generating Visualization 1: Jittered Scatterplot ---
--- Generating Visualization 2: Bar Plot of Median Knowledge ---
======================================================================
### SPEARMAN'S RANK CORRELATION: KNOWLEDGE VS. DIFFICULTY ###
======================================================================
This analysis investigates the relationship between a respondent's self-assessed formal knowledge in ethics and the frequency with which they report difficulty understanding research protocols. Spearman's rank correlation (rho) is used to measure the monotonic relationship between these two ordinal variables.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀ (Null Hypothesis): There is NO monotonic correlation between the variables.
H₁ (Alternative Hypothesis): There IS a monotonic correlation between the variables.
----------------------------------------------------------------------
[2] Statistical Results
----------------------------------------------------------------------
Spearman's Correlation Coefficient (rho): -0.2305
P-value: < 0.0001 (prints as 0.0000 at four decimal places)
Sample Size (n): 363
----------------------------------------------------------------------
[3] Interpretation
----------------------------------------------------------------------
The p-value (< 0.0001) is less than our significance level of 0.05. Therefore, we REJECT the Null Hypothesis. The observed correlation is statistically significant. The correlation coefficient (rho = -0.23) indicates a weak negative monotonic relationship. This means that as self-assessed formal ethics knowledge increases, the reported frequency of difficulty in understanding protocols tends to decrease.
----------------------------------------------------------------------
[4] Final Conclusion
======================================================================
There is a statistically significant, negative relationship between a respondent's self-assessed formal knowledge in ethics and the frequency of difficulty they report in understanding protocols. The strength of this relationship is weak. In practical terms, individuals who rate their ethics knowledge higher tend to report facing difficulties in protocol comprehension less often.
======================================================================
======================================================================
### KRUSKAL-WALLIS H-TEST: DETAILED ANALYTICAL REPORT ###
======================================================================
The Kruskal-Wallis H-Test is a non-parametric method used to determine if there are statistically significant differences between two or more independent groups on an ordinal or continuous dependent variable. It is the non-parametric equivalent of a one-way ANOVA.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀ (Null Hypothesis): The distributions of the perceived 3Rs application scores are IDENTICAL for all professional roles on the CEUA.
H₁ (Alternative Hypothesis): The distribution of perceived 3Rs application scores is DIFFERENT for at least one professional role.
----------------------------------------------------------------------
[2] Descriptive Statistics: A First Look at the Data
----------------------------------------------------------------------
The median score for each group provides a measure of central tendency:
Role
Animal Protection Society Rep.    4.0
Faculty/Lecturer                  4.0
Representative (Other Areas)      4.0
Not Specified                     4.0
Veterinarian                      4.0
Ad-hoc Consultant                 5.0
Biologist                         5.0
Researcher                        5.0
Observation: There appears to be some variation in the median scores across roles. The test will determine if these differences are statistically significant.
----------------------------------------------------------------------
[3] Inferential Statistics: Testing for Significance
----------------------------------------------------------------------
Kruskal-Wallis H-statistic: 7.3747
Calculated p-value: 0.3909
----------------------------------------------------------------------
[4] Effect Size: Measuring the Magnitude of the Difference
----------------------------------------------------------------------
Epsilon-squared (ε²) estimates the proportion of variance in the scores that is explained by the different roles. A common guide for interpretation is: ~0.01 (small), ~0.08 (medium), and ~0.26 (large effect).
Epsilon-squared (ε²): 0.0010
----------------------------------------------------------------------
[5] Analytical Conclusion
----------------------------------------------------------------------
The p-value (0.3909) is greater than our significance level of 0.05. Therefore, we FAIL TO REJECT the Null Hypothesis. We do not have sufficient statistical evidence to conclude that a difference in perception exists across the different roles based on this data.
======================================================================
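Two effect sizes derived from the H statistic are in common use, and they can give quite different numbers. The sketch below evaluates both for the reported H = 7.3747 with k = 8 roles. It assumes n = 369 (the full active sample), because the report does not print the sample size used for this test, and the function names are illustrative.

```python
def epsilon_squared(h, n):
    """Epsilon-squared for Kruskal-Wallis: H / (n - 1)."""
    return h / (n - 1)

def eta_squared_h(h, k, n):
    """Eta-squared based on H: (H - k + 1) / (n - k)."""
    return (h - k + 1) / (n - k)

h_stat = 7.3747
k_groups = 8     # roles listed in the descriptive table
n_total = 369    # assumption: the report does not state n for this test

eps2 = epsilon_squared(h_stat, n_total)
eta2 = eta_squared_h(h_stat, k_groups, n_total)
print(round(eps2, 4), round(eta2, 4))
```

Under that assumption the printed value of 0.0010 agrees with the eta-squared formula, while H / (n - 1) would give about 0.0200, so the label on the reported statistic may be worth double-checking. Both values are small, and the conclusion of the test does not change.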
Success: Loaded data for analysis.
======================================================================
### KRUSKAL-WALLIS H-TEST: DETAILED ANALYTICAL REPORT ###
======================================================================
This analysis investigates whether the perceived necessity of using
animals for food differs across religious groups. Since we are
comparing an ordinal dependent variable (Necessity_Food scale) across
multiple independent nominal groups (Religion), the Kruskal-Wallis
H-Test is the appropriate non-parametric method.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀ (Null Hypothesis): The distributions of the 'necessity of animals for food' scores are IDENTICAL for all religious groups.
H₁ (Alternative Hypothesis): The distribution of scores is DIFFERENT for at least one religious group.
----------------------------------------------------------------------
[2] Descriptive Statistics: A First Look at the Data
----------------------------------------------------------------------
The median score for each group provides a measure of central tendency:
Religion
Other 3.0
Spiritism 3.0
No Religion 3.0
Agnosticism/Atheism 4.0
Catholicism 4.0
Evangelical 4.0
Afro-Brazilian 5.0
Observation: There appears to be some variation in the median scores across groups. The test will determine if these differences are statistically significant.
----------------------------------------------------------------------
[3] Inferential Statistics: Testing for Significance
----------------------------------------------------------------------
Kruskal-Wallis H-Test (Omnibus Test):
H-statistic: 12.9912
P-value: 0.0432
----------------------------------------------------------------------
[4] Analytical Conclusion
----------------------------------------------------------------------
The p-value (0.0432) is less than our significance level of 0.05.
Therefore, we REJECT the Null Hypothesis. There is statistically
significant evidence to conclude that the perceived necessity of using
animals for food differs across religious groups.
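Post-Hoc Analysis (Dunn's Test with Bonferroni Correction):
The following table shows the adjusted p-values for pairwise comparisons. None falls below 0.05 (the smallest is 0.1524, Catholicism vs. Spiritism), so the significant omnibus result cannot be attributed to any specific pair of groups.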
Afro-Brazilian Agnosticism/Atheism Catholicism \
Afro-Brazilian 1.0000 1.0 1.0000
Agnosticism/Atheism 1.0000 1.0 1.0000
Catholicism 1.0000 1.0 1.0000
Evangelical 1.0000 1.0 1.0000
No Religion 0.7927 1.0 1.0000
Other 1.0000 1.0 1.0000
Spiritism 0.3023 1.0 0.1524
Evangelical No Religion Other Spiritism
Afro-Brazilian 1.0 0.7927 1.0 0.3023
Agnosticism/Atheism 1.0 1.0000 1.0 1.0000
Catholicism 1.0 1.0000 1.0 0.1524
Evangelical 1.0 1.0000 1.0 1.0000
No Religion 1.0 1.0000 1.0 1.0000
Other 1.0 1.0000 1.0 1.0000
Spiritism 1.0 1.0000 1.0 1.0000
======================================================================
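The many 1.0 entries in the post-hoc table are a direct consequence of the Bonferroni adjustment, which multiplies each raw p-value by the number of comparisons and caps the result at 1. The sketch below assumes the adjustment was made over all pairs of the seven groups; the implied raw p-values are back-calculated under that assumption and are not printed by the notebook.

```python
from itertools import combinations

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

groups = ["Afro-Brazilian", "Agnosticism/Atheism", "Catholicism", "Evangelical",
          "No Religion", "Other", "Spiritism"]
pairs = list(combinations(groups, 2))
m = len(pairs)  # number of pairwise comparisons among 7 groups

# Adjusted p-values below 1.0 as printed in the post-hoc table.
reported_adjusted = {
    ("Catholicism", "Spiritism"): 0.1524,
    ("Afro-Brazilian", "Spiritism"): 0.3023,
    ("Afro-Brazilian", "No Religion"): 0.7927,
}
# Assumption: adjusted = raw * m, so the raw value is recovered by dividing by m.
implied_raw = {pair: p / m for pair, p in reported_adjusted.items()}

print(m)
for pair, raw in implied_raw.items():
    print(pair, round(raw, 4))
```

Under this assumption, the strongest contrast has a raw p-value of about 0.007, which would pass an unadjusted 0.05 threshold but not the per-comparison threshold of 0.05 / 21. This explains how a significant omnibus test can coexist with no significant pairs.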
--- Generating Visualization ---
Success: Loaded data for analysis.
======================================================================
### MANN-WHITNEY U TEST: DETAILED ANALYTICAL REPORT ###
======================================================================
This analysis investigates whether the perception of being respected in CEUA meetings differs between vegan/vegetarian members and non-vegan/vegetarian members. We are comparing an ordinal dependent variable (Perceived Respect scale) between two independent nominal groups (Vegan Status). The Mann-Whitney U test is the correct non-parametric method for this comparison.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀ (Null Hypothesis): The distributions of perceived respect scores are IDENTICAL for both groups.
H₁ (Alternative Hypothesis): The distributions of perceived respect scores are DIFFERENT for the two groups.
----------------------------------------------------------------------
[2] Descriptive Statistics: A First Look at the Data
----------------------------------------------------------------------
The median score for each group provides a measure of central tendency:
Median Score (Vegan/Vegetarian Group): 3.00
Median Score (Non-Vegan/Vegetarian Group): 3.00
----------------------------------------------------------------------
[3] Inferential Statistics: Testing for Significance
----------------------------------------------------------------------
Mann-Whitney U statistic: 6091.5
Calculated p-value: 0.1186
----------------------------------------------------------------------
[4] Effect Size: Measuring the Magnitude of the Difference
----------------------------------------------------------------------
Rank-Biserial Correlation (r): 0.1006
Interpretation Guide: |r| ≈ 0.1 (small), |r| ≈ 0.3 (medium), |r| ≈ 0.5 (large effect).
----------------------------------------------------------------------
[5] Analytical Conclusion
----------------------------------------------------------------------
The p-value (0.1186) is greater than our significance level of 0.05. Therefore, we FAIL TO REJECT the Null Hypothesis. We do not have sufficient statistical evidence to conclude that a difference in perceived respect exists between the two groups based on this data.
======================================================================
Success: Loaded data for analysis.
======================================================================
### MANN-WHITNEY U TEST: DETAILED ANALYTICAL REPORT ###
======================================================================
This analysis investigates whether CEUAs with representatives from Animal Protection Societies (SPA) refuse a different number of licenses compared to CEUAs without such representation. We are comparing a numerical (count) dependent variable (Refused Licenses) between two independent nominal groups (SPA Presence). Given the expected non-normal, skewed nature of count data, the Mann-Whitney U test is the appropriate non-parametric method.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀ (Null Hypothesis): The distributions of the number of refused licenses are IDENTICAL for both groups (with and without SPA representation).
H₁ (Alternative Hypothesis): The distributions of the number of refused licenses are DIFFERENT for the two groups.
----------------------------------------------------------------------
[2] Descriptive Statistics: A First Look at the Data
----------------------------------------------------------------------
The median is the most robust measure of central tendency for skewed data:
Median Refused Licenses (SPA Present): 1.00
Median Refused Licenses (SPA Not Present): 1.00
----------------------------------------------------------------------
[3] Inferential Statistics: Testing for Significance
----------------------------------------------------------------------
Mann-Whitney U statistic: 3089.0
Calculated p-value: 0.5488
----------------------------------------------------------------------
[4] Effect Size: Measuring the Magnitude of the Difference
----------------------------------------------------------------------
Rank-Biserial Correlation (r): -0.0575
Interpretation Guide: |r| ≈ 0.1 (small), |r| ≈ 0.3 (medium), |r| ≈ 0.5 (large effect).
----------------------------------------------------------------------
[5] Analytical Conclusion
----------------------------------------------------------------------
The p-value (0.5488) is greater than our significance level of 0.05. Therefore, we FAIL TO REJECT the Null Hypothesis. We do not have sufficient statistical evidence to conclude that a difference in the number of refused licenses exists between the two groups based on this data.
======================================================================
Success: Loaded raw data for analysis using a plain SQL query.
--- Data Preparation Summary ---
Initial respondents loaded 369
Respondents from target institutions (Public, Private, Commercial) 368
...of whom answered 'Não sei dizer' for refused licenses (-135)
Final respondents in analysis (with valid numerical data) 233
----------------------------------------------------------------------
======================================================================
### KRUSKAL-WALLIS H-TEST: DETAILED ANALYTICAL REPORT ###
======================================================================
This analysis investigates whether the number of refused licenses
differs across Public, Private, and Commercial institutions. The
dependent variable ('Refused Licenses') contained mixed data types
(numbers, text ranges, and non-numeric answers). A custom parser
converted this data into a single numerical scale. Closed ranges
(e.g., 'Entre 10 e 14', "between 10 and 14") were converted to their
midpoint (e.g., 12), and open-ended responses (e.g., '20 ou mais',
"20 or more") were converted to their lower bound (e.g., 20).
Respondents who answered 'Não sei dizer' ("I can't say") were
excluded from this specific statistical test. As we are comparing a
non-normally distributed numerical variable across three independent
groups, the Kruskal-Wallis H-test is the appropriate omnibus method.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀: The distributions of refused licenses are IDENTICAL for all institution types.
H₁: The distribution of refused licenses is DIFFERENT for at least one institution type.
----------------------------------------------------------------------
[2] Descriptive Statistics
----------------------------------------------------------------------
Median Refused Licenses by Institution Type:
Institution_Nature
Commercial 0.0
Public 2.0
Private 3.0
Name: Refused_Licenses_Count, dtype: float64
----------------------------------------------------------------------
[3] Inferential Statistics (Omnibus Test)
----------------------------------------------------------------------
Kruskal-Wallis H-statistic: 12.8063
P-value: 0.0017
----------------------------------------------------------------------
[4] Effect Size
----------------------------------------------------------------------
Epsilon-squared (ε²): 0.0552
Interpretation: ε² estimates the proportion of variance in the ranks
of the scores explained by group membership.
----------------------------------------------------------------------
[5] Analytical Conclusion
----------------------------------------------------------------------
The p-value (0.0017) is less than 0.05. We REJECT the Null Hypothesis.
There is significant evidence that a difference exists among Public,
Private, and Commercial institutions regarding the number of refused
licenses.
Performing Dunn's Post-Hoc Test...
Pairwise p-values (Bonferroni corrected):
            Commercial  Private  Public
Commercial      1.0000   0.0012  0.0025
Private         0.0012   1.0000  1.0000
Public          0.0025   1.0000  1.0000
Significant differences were found between the following groups:
- Commercial and Private (p=0.0012)
- Commercial and Public (p=0.0025)
======================================================================
--- Generating Visualization ---
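The parsing rules described in the report above (closed ranges to their midpoint, open-ended answers to their lower bound, 'Não sei dizer' excluded) can be sketched as follows. This is an illustration only: the function name `parse_refused_licenses` and the exact regular expressions are assumptions, and the notebook's actual parser may handle more answer formats.

```python
import re


def parse_refused_licenses(answer):
    """Map a raw survey answer to a number, or None when it is excluded.

    Hypothetical helper following the rules stated in the report:
      '7'             -> 7.0   (plain number)
      'Entre 10 e 14' -> 12.0  (midpoint of a closed range)
      '20 ou mais'    -> 20.0  (lower bound of an open-ended answer)
      anything else   -> None  (for example 'Não sei dizer')
    """
    text = str(answer).strip()

    if re.fullmatch(r"\d+", text):
        return float(text)

    closed = re.fullmatch(r"Entre (\d+) e (\d+)", text, flags=re.IGNORECASE)
    if closed:
        low, high = int(closed.group(1)), int(closed.group(2))
        return (low + high) / 2

    open_ended = re.fullmatch(r"(\d+) ou mais", text, flags=re.IGNORECASE)
    if open_ended:
        return float(open_ended.group(1))

    return None


raw_answers = ["3", "Entre 10 e 14", "20 ou mais", "Não sei dizer"]
parsed = [parse_refused_licenses(a) for a in raw_answers]
valid = [value for value in parsed if value is not None]
print(parsed, len(valid))
```

Returning `None` rather than a sentinel number keeps excluded answers out of any median or rank calculation, which matches the drop from 368 to 233 respondents in the data preparation summary.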
Success: Loaded raw data for analysis using a plain SQL query.
--- Data Preparation Summary ---
Initial respondents loaded 369
Respondents from target institutions (Public, Private, Commercial) 368
...of whom answered 'Não sei dizer' for SPA presence (-36)
Final respondents in analysis (with complete data) 332
----------------------------------------------------------------------
--- Contingency Table (Observed Frequencies) ---
SPA_Present         No  Yes
Institution_Nature
Commercial           4   11
Private             22   72
Public              50  173
======================================================================
### CHI-SQUARED TEST OF INDEPENDENCE: ANALYTICAL REPORT ###
======================================================================
This analysis investigates whether an association exists between the
nature of an institution (Public, Private, Commercial) and the
presence of an Animal Protection Society (SPA) representative. Since
both variables are categorical (nominal), the Chi-Squared (χ²) test of
independence is the correct statistical method to determine if the
observed proportions differ significantly from what would be expected
by chance.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀ (Null Hypothesis): There is NO association between institution type and SPA presence.
H₁ (Alternative Hypothesis): There IS an association between institution type and SPA presence.
----------------------------------------------------------------------
[2] Statistical Results
----------------------------------------------------------------------
Chi-Squared Statistic (χ²): 0.1630
P-value: 0.9217
Degrees of Freedom: 2
----------------------------------------------------------------------
[3] Effect Size
----------------------------------------------------------------------
Cramér's V: 0.0222
Interpretation: Cramér's V measures the strength of association.
0 indicates no association, 1 indicates a perfect association. A
common guide is: ~0.1 (small), ~0.3 (medium), ~0.5 (large effect).
----------------------------------------------------------------------
[4] Analytical Conclusion
----------------------------------------------------------------------
The p-value (0.9217) is greater than 0.05. We FAIL TO REJECT the Null
Hypothesis. There is insufficient evidence to conclude an association
exists between the variables.
======================================================================
--- Generating Visualization ---
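The statistics in this report can be recomputed by hand from the contingency table. The sketch below uses only the standard library and the observed counts printed above; it is a cross-check written for this explanation, not the notebook's own code, which may call a statistics library. The closed-form p-value used here holds only because the table has exactly 2 degrees of freedom.

```python
import math

# Observed counts from the report: (SPA absent, SPA present) per institution type.
observed = {
    "Commercial": (4, 11),
    "Private": (22, 72),
    "Public": (50, 173),
}

rows = list(observed.values())
n_cols = 2
n = sum(sum(row) for row in rows)
row_totals = [sum(row) for row in rows]
col_totals = [sum(row[j] for row in rows) for j in range(n_cols)]

# Pearson chi-squared: sum of (observed - expected)^2 / expected over all cells.
chi2 = 0.0
for i, row in enumerate(rows):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (count - expected) ** 2 / expected

dof = (len(rows) - 1) * (n_cols - 1)

# For dof == 2 the chi-squared survival function is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)

# Cramér's V = sqrt(chi2 / (n * min(rows - 1, cols - 1))).
cramers_v = math.sqrt(chi2 / (n * min(len(rows) - 1, n_cols - 1)))

print(f"chi2={chi2:.4f} dof={dof} p={p_value:.4f} V={cramers_v:.4f}")
```

Under these assumptions the recomputed values agree with the report to four decimal places, which also confirms that no continuity correction was applied to this 3×2 table.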
Success: Loaded and joined data for analysis using a plain SQL query.
--- Data Preparation Summary ---
Initial respondents loaded 369
Respondents with a valid, codified role 369
...of whom answered 'Não sei dizer' for perceived respect (-11)
Final respondents in analysis (with complete data) 358
----------------------------------------------------------------------
Final Role Frequencies for Analysis:
Role
Faculty/Lecturer 97
Veterinarian 91
Researcher 67
Biologist 45
Other 28
Representative (Other Areas) 15
SPA Representative 12
Ad-hoc Consultant 3
Name: count, dtype: int64
----------------------------------------------------------------------
======================================================================
### KRUSKAL-WALLIS H-TEST: DETAILED ANALYTICAL REPORT ###
======================================================================
This analysis investigates whether the perception of being respected
differs based on a member's pre-codified professional role. The roles
were retrieved by joining the survey data with the 'PapelCEUALookup'
table for maximum accuracy. The ordinal 'Perceived Respect' scale was
numerically encoded (0=Never to 3=Constantly). The Kruskal-Wallis
H-test is the appropriate method for this comparison of an ordinal
variable across multiple nominal groups.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀: The distributions of perceived respect scores are IDENTICAL for all professional roles.
H₁: The distribution of scores is DIFFERENT for at least one professional role.
----------------------------------------------------------------------
[2] Descriptive Statistics
----------------------------------------------------------------------
Median Perceived Respect Score by Codified Role:
Role
Ad-hoc Consultant 3.0
Biologist 3.0
Faculty/Lecturer 3.0
Other 3.0
Representative (Other Areas) 3.0
Researcher 3.0
SPA Representative 3.0
Veterinarian 3.0
Name: Perceived_Respect_Score, dtype: float64
----------------------------------------------------------------------
[3] Inferential Statistics (Omnibus Test)
----------------------------------------------------------------------
Kruskal-Wallis H-statistic: 10.4753
P-value: 0.1632
----------------------------------------------------------------------
[4] Effect Size
----------------------------------------------------------------------
Epsilon-squared (ε²): 0.0293
Interpretation: ε² estimates the proportion of variance in the ranks
of the scores explained by group membership.
----------------------------------------------------------------------
[5] Analytical Conclusion
----------------------------------------------------------------------
The p-value (0.1632) is greater than 0.05. We FAIL TO REJECT the Null
Hypothesis. There is insufficient evidence to conclude a difference in
perception exists across the groups.
======================================================================
--- Generating Visualization ---
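The reported ε² values are consistent with the common formula ε² = H / (n − 1), where H is the Kruskal-Wallis statistic and n is the number of respondents in the analysis. That the notebook computes ε² this way is an inference from the printed numbers, not something the output states. The helper name `epsilon_squared` is hypothetical.

```python
def epsilon_squared(h_statistic, n_total):
    """Epsilon-squared effect size for a Kruskal-Wallis H statistic.

    Uses eps^2 = H / (n - 1), which is algebraically the same as the
    often-quoted form H * (n + 1) / (n^2 - 1).
    """
    return h_statistic / (n_total - 1)


# Professional-role report: H = 10.4753 with n = 358 respondents.
eps_role = epsilon_squared(10.4753, 358)

# Institution-type report: H = 12.8063 with n = 233 respondents.
eps_institution = epsilon_squared(12.8063, 233)

print(round(eps_role, 4), round(eps_institution, 4))
```

Both results round to the values printed in the corresponding reports (0.0293 and 0.0552), so a reader can reproduce the effect sizes from the H statistic and the final sample size alone.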
Success: Loaded raw data for analysis using a plain SQL query.
--- Data Preparation Summary ---
Initial respondents loaded 369
...of whom answered 'Não sei dizer' for refused licenses (-135)
Final respondents in analysis (with complete data) 234
----------------------------------------------------------------------
--- Final Group Composition for Analysis ---
RejectionStatus
One or More Rejections 153
Zero Rejections 81
Name: count, dtype: int64
----------------------------------------------------------------------
======================================================================
### MANN-WHITNEY U TEST: DETAILED ANALYTICAL REPORT ###
======================================================================
This analysis investigates if there is a difference in CEUA experience
(in years) between members who have never had a proposal rejected and
those who have had one or more rejected. The complex 'Refused
Licenses' data was parsed and binned into two groups ('Zero
Rejections', 'One or More Rejections'). The Mann-Whitney U test is the
appropriate non-parametric method to compare the distributions of a
continuous variable between two independent groups, as it does not
assume normality.
----------------------------------------------------------------------
[1] Stating the Hypotheses
----------------------------------------------------------------------
H₀: The distributions of experience time are IDENTICAL for both groups.
H₁: The distributions of experience time are DIFFERENT for the two groups.
----------------------------------------------------------------------
[2] Descriptive Statistics
----------------------------------------------------------------------
The median is a robust measure of central tendency for potentially skewed data:
Median Time on CEUA (Zero Rejections): 2.25 years
Median Time on CEUA (One or More Rejections): 3.50 years
----------------------------------------------------------------------
[3] Inferential Statistics
----------------------------------------------------------------------
Mann-Whitney U statistic: 4213.5
Calculated p-value: < 0.0001
----------------------------------------------------------------------
[4] Effect Size
----------------------------------------------------------------------
Rank-Biserial Correlation (r): 0.3200
Interpretation: |r| measures the effect size, with ~0.1 (small), ~0.3
(medium), ~0.5 (large).
----------------------------------------------------------------------
[5] Analytical Conclusion
----------------------------------------------------------------------
The p-value (< 0.0001) is less than 0.05. We REJECT the Null Hypothesis.
There is statistically significant evidence that a difference exists
in the distribution of experience time between the two groups.
======================================================================
--- Generating Visualization ---
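The rank-biserial correlation in this report can be reproduced from the U statistic and the two group sizes with r = 1 − 2U / (n₁n₂). The sketch below plugs in the values printed above (U = 4213.5, groups of 81 and 153). The helper name `rank_biserial` is hypothetical, and the assumption that the notebook uses this exact formula is inferred from the numbers matching.

```python
def rank_biserial(u_statistic, n1, n2):
    """Rank-biserial correlation from a Mann-Whitney U statistic.

    r = 1 - 2U / (n1 * n2). The sign depends on which group's U is
    passed in, so only |r| should be read against the size guide.
    """
    return 1 - 2 * u_statistic / (n1 * n2)


# Values taken from the report: 81 'Zero Rejections' and
# 153 'One or More Rejections' respondents, U = 4213.5.
r = rank_biserial(4213.5, 81, 153)
print(round(r, 4))
```

With 81 × 153 = 12,393 possible pairs, the result rounds to the reported 0.32, a medium effect on the guide given in the report.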